NIST and NFI-TNO evaluations of automatic speaker recognition

Authors

  • David A. van Leeuwen
  • Alvin F. Martin
  • Mark A. Przybocki
  • Jos S. Bouten
Abstract

In recent years, several text-independent speaker recognition evaluation campaigns have taken place. This paper reports on results of the NIST evaluation of 2004 and the NFI-TNO forensic speaker recognition evaluation held in 2003, and reflects on the history of the evaluation campaigns. The effects of speech duration, training handsets, transmission type, and gender mix show the expected behaviour on the DET curves. New results on the influence of language show an interesting dependence of the DET curves on the accent of speakers. We also report on a number of statistical analysis techniques that have recently been introduced in the speaker recognition community, as well as a new application of the analysis of deviance. These techniques are used to determine that the two evaluations held in 2003, by NIST and NFI-TNO, differ statistically in difficulty for the speaker recognition systems.

Published by Elsevier Ltd., 2005. doi:10.1016/j.csl.2005.07.001. D.A. van Leeuwen et al. / Computer Speech and Language 20 (2006) 128–158.

Similar resources

The WCL-1 System in the 2003 NIST Speaker Recognition Evaluation and 2003 NFI/TNO Forensic Speaker Recognition Evaluation

In the present work we discuss the results that our speaker verification system, WCL-1, obtained in the 2003 NFI/TNO Forensic Speaker Recognition Evaluation. These results, together with the ones obtained in the 2003 NIST Speaker Recognition Evaluation, give an opportunity for in-depth analysis of various aspects of real-world application of speaker recognition technology. Based on the d...


Fusing discriminative and generative methods for speaker recognition: experiments on switchboard and NFI/TNO field data

Discriminatively trained support vector machines have recently been introduced as a novel approach to speaker recognition. Support vector machines (SVMs) have a distinctly different modeling strategy in the speaker recognition problem. The standard Gaussian mixture model (GMM) approach focuses on modeling the probability density of the speaker and the background (a generative approach). In cont...


Evaluating Automatic Speaker Recognition systems: An overview of the NIST Speaker Recognition Evaluations (1996-2014); Evaluando los sistemas automáticos de reconocimiento de locutor: Panorama de las evaluaciones NIST de reconocimiento de locutor (1996-2014)

Automatic Speaker Recognition systems show interesting properties, such as speed of processing or repeatability of results, in contrast to speaker recognition by humans. But they will be usable just if they are reliable. Testability, or the ability to extensively evaluate the goodness of the speaker detector decisions, becomes then critical. In the last 20 years, the US National Institute of St...


On robust estimation of likelihood ratios: the ATVS-UPM system at 2003 NFI/TNO forensic evaluation

This paper summarizes the different algorithms developed in ATVS-UPM in order to submit a reliable Likelihood Ratio based forensic system, fully compliant with the bayesian framework for the analysis of forensic evidences, to 2003 NFI-TNO Forensic Speaker Recognition Evaluation. Once identified the main causes and consequences of the erratic estimation of Likelihood Ratios due to forensic condi...


Human Assisted Speaker Recognition In NIST SRE10

The NIST series of Speaker Recognition Evaluations (SRE’s) have, since 1996, evaluated automatic systems for speaker recognition. The 2010 evaluation (SRE10) also included a test of Human Assisted Speaker Recognition (HASR), in which systems based, in whole or in part, on human expertise were evaluated. Participants were invited to complete the trials in one of two small subsets of the full set...



Journal title:
  • Computer Speech & Language

Volume 20

Pages 128–158

Publication date 2006